DOCUMENT RESUME

ED 462 905                                              HE 034 782

AUTHOR          Hoyt, Donald P.; Pallett, William H.
TITLE           Appraising Teaching Effectiveness: Beyond Student Ratings.
                IDEA Paper.
INSTITUTION     Kansas State Univ., Manhattan. IDEA Center.
REPORT NO       IDEA-36
PUB DATE        1999-11-00
NOTE            8p.
AVAILABLE FROM  Kansas State University, IDEA Center, Inc., 1615 Anderson
                Avenue, Manhattan, KS 66502-4073. Tel: 800-255-2757 (Toll
                Free); Tel: 785-532-5970; Fax: 785-532-5637; e-mail:
                IDEA@ksu.edu.
PUB TYPE        Guides - Non-Classroom (055)
EDRS PRICE
DESCRIPTORS     *College Faculty; Evaluation Methods; Higher Education;
                *Instructional Effectiveness; *Student Evaluation of Teacher
                Performance; *Teacher Effectiveness; *Teacher Evaluation



ABSTRACT 



Evaluating faculty effectiveness is important in 
institutions of higher education. Although evaluation is inherently 
threatening to most faculty members, the vast majority take their assignments 
seriously and want to conduct them as effectively as possible. Assessing 
faculty performance is a complex and time-consuming process. If it is done 
poorly or insensitively, it can have an adverse effect on institutional 
quality. Whether or not individual institutions elect to commit the resources 
required for valid evaluations depends on the degree to which they agree with 
these propositions: (1) all members of the institution should be accountable
for their activities and performance; (2) the conduct and use of credible
evaluation programs have an important influence on the welfare and future 
excellence of the individual, the department, and the institution; and (3) 
when improvement efforts are supported by institutional policy and guided by 
comprehensive and valid appraisals of current functioning, the well-being of 
the individual and the institution are affected positively. (Contains 18 
references.) (SLD) 



Reproductions supplied by EDRS are the best that can be made 
from the original document. 







Appraising Teaching Effectiveness: 
Beyond Student Ratings 

By Donald P. Hoyt, William H. Pallett 

IDEA Paper #36 

IDEA CENTER 



Appraising Teaching Effectiveness: Beyond Student Ratings

Donald P. Hoyt and William H. Pallett 
IDEA Center 



Evaluating faculty effectiveness is important in nearly every
institution of higher education. Assessing the effectiveness with
which various functions are performed is essential to a variety
of important administrative recommendations and decisions.
It also provides feedback which influences the faculty
member's self-image and professional satisfaction. And it
establishes a climate which communicates the institution's
commitment to professional improvement and confidence that
every faculty member will make a valuable contribution to the
achievement of shared goals.

There are two types of contributions faculty members make to
the programs of a department/institution: indirect and direct.
Indirect contributions, while not impacting directly upon the
achievement of a program's objectives (principally student
learning, in the case of instruction; new insights and knowledge,
in the case of research; assistance to clients, in the case
of service), make a difference to the program's success by
affecting the environment of the department/division, the
appropriateness and quality of its plans, and the attitudes and
skills of other members. Direct contributions are those in
which the achievement of program goals is impacted by the
individual's personal intervention or involvement.

In many institutions, research and service programs are vitally
important. Assessing a faculty member's contributions to each
constitutes a serious challenge. However, this paper is concerned
only with the instructional program, a focus which is
central to the mission of almost every institution of higher
education.

Assessing Effectiveness

Direct contributions to the instructional program. Most
institutions employ a "student rating" system to assist in the
evaluation of instruction. Obtaining student feedback is not
only a relatively simple procedure but also one which has
considerable credibility for several reasons. (1) Input is
received from a number of raters so that reliability is usually
quite high. (2) Ratings are made by those who have consistently
observed the teacher over many hours, so that they are
based on representative behavior. (3) Observations about
student learning, the object of instruction, are made by those
who have been personally affected and therefore have high
face validity. An enormous volume of research supports the
credibility and validity of student ratings (Aleamoni, 1981;
Cashin, 1995; Braskamp & Ory, 1994).



On the other hand, student rating systems have several
important limitations. (1) Some of them are poorly constructed
(they ask questions about matters which are unrelated to student
learning; employ words with unclear meanings; use double-barreled
questions; offer response alternatives which fail to exhaust
the possibilities; etc.). (2) In some instances, administrative
procedures have not been standardized, so that results are not
comparable from one faculty member to the next. (3) Some
systems fail to take into account extraneous influences (factors
which influence ratings but which are beyond the instructor's
control, such as class size, student academic motivation, or
course/disciplinary difficulty). (4) Technical and statistical
support is lacking for some systems, so that interpretation of
results is problematic.

Even when these potential difficulties are adequately addressed,
authorities are agreed that there are a number of
important matters related to teaching effectiveness for which
students are unqualified to provide valid reports. Cashin
(1989) lists 26 specific considerations which he regards as
relevant to instructional effectiveness; students are unqualified
to provide valid observations for 11 of these, including an
array of factors related to subject matter mastery, course
design, and curriculum development. Similarly, Cohen and
McKeachie (1980) identified 10 criteria of teaching effectiveness
which colleagues, but not students, could assess, two of
which describe indirect influences (commitment to teaching
and support for departmental efforts). Keig and Waggoner
(1994) synthesized the Cohen and McKeachie criteria into
three general features of teaching effectiveness which students
are unable to judge validly: (1) the goals, content, and
organization of course design, (2) methods and materials used
in delivery, and (3) evaluation of student work, including
grading practices.

There is a general consensus that students are unable to judge
such vital matters as currency of course content or the degree
to which it provides a representative (as opposed to biased)
view of the subject matter. Nor can they judge the clarity,
comprehensiveness, or realism of objectives, the degree to which
readings and other assignments are balanced and appropriate,
the validity of procedures for assessing student achievement, or
the degree to which grading standards are in line with the
department's or institution's expectations or policies.




How should the gaps created by shortcomings in student 
ratings be closed? A wide variety of suggestions have been 
made. Most frequently cited are self-reports, colleague 
ratings, and ratings by department heads/chairs. 

Seldin (1999) has recently reviewed the value and limitations
of self-reports. Clearly, self-interest limits the use of these in
the evaluation of teaching effectiveness for administrative
purposes. But a reflective analysis on the part of the instructor
can be instrumental in promoting instructional improvement
(Braskamp & Ory, 1994). In addition, the instructor is the
only person who can supply certain kinds of information
needed by those charged with making such evaluations,
including information about course objectives; readings,
assignments, and other learning activities; the creation of
instructional materials or learning opportunities; procedures
for appraising student achievement; results of, and course
modifications based on, classroom research and other faculty
efforts directed to improving instructional skills. Such reports
are commonly made through the faculty member's annual
report of instructional efforts; an illustrative outline for making
such reports is available on the Center's web page
(www.idea.ksu.edu) or, in hard copy form, from the Center
(request Appendix A). The annual report can be useful in
developing the faculty member's teaching portfolio (Seldin, 1993;
1997; Zubizarreta, 1999), a device for organizing relevant
information for both appraisal and improvement purposes.

Most of the crucial features of instruction which students are
not qualified to judge can, under certain circumstances, be
assessed by faculty colleagues. How this is best done is
controversial. Centra (1993) has summarized research
related to peer classroom observation and concluded that, as
currently practiced, such observations are neither reliable nor
valid. On the other hand, DeZure (1999) has identified seven
steps which can be taken to overcome these shortcomings,
including the use of multiple observations and observers, the
training of observers, and the employment of a validated
observation instrument. Examples of such instruments are
found in Seldin's (1999) recent book; observers are expected
to rate such factors as knowledge of the subject, enthusiasm,
sensitivity to the class's level of knowledge, preparation and
organization, and clarity of presentation.¹

Although colleagues may be able to assess such factors with
acceptable reliability and validity, impositions on faculty time
make such a process unrealistic on many campuses. DeZure
points out that, besides a 2-4 hour training commitment,
colleagues must be prepared to spend about four hours in
each observation (pre-observation meeting of 30 minutes,
30 minutes to review materials, 75 minutes to observe,
60 minutes to prepare a joint report, and 45 minutes in a
post-meeting with the instructor).
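
As a quick check on the arithmetic, the components DeZure lists can be totaled. The sketch below uses only the minute figures quoted above; the dictionary keys are illustrative labels chosen here, not DeZure's terms.

```python
# Minutes per activity for one peer classroom observation, as quoted above.
# The key names are illustrative labels, not terminology from the source.
observation_minutes = {
    "pre_observation_meeting": 30,
    "review_of_materials": 30,
    "classroom_observation": 75,
    "joint_report": 60,
    "post_observation_meeting": 45,
}

# Total commitment per observation, in minutes and in hours.
total_minutes = sum(observation_minutes.values())
total_hours = total_minutes / 60

print(f"{total_minutes} minutes = {total_hours:.1f} hours per observation")
```

The components add up to 240 minutes, which matches the "about four hours" figure, and this excludes the separate 2-4 hour training commitment.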

There is reason to believe that such an extensive time commitment
may be necessary when classroom observation is
geared to instructional improvement (see Summative and
Formative Purposes). However, when the purpose is
primarily to arrive at a summary estimate of teaching effectiveness,
the rating of relevant materials is an attractive
alternative to classroom visitation. In general, three raters are
asked to make independent judgments based on these materials
and then, through discussion, arrive at a consensus.² If
these ratings are guided by a carefully developed instrument, 
the consensus rating will usually possess satisfactory reliability 




for expediting this process, are available on the Center's web 
site; for hard copy, request Appendix B from the Center. 

The department or division head/chair is responsible for
gathering and synthesizing all evaluation information. Time
constraints imposed by classroom visitations or an in-depth
review of instructional materials will generally prevent this
administrator from making a personal assessment of direct
contributions to the instructional program. But he/she can be
an important source of information in assessing both the
faculty member's scholarly excellence and his/her indirect
contributions (see the following section). The head/chair is also
in a better position than anyone else to judge the degree of
professional responsibility exhibited by the faculty member
through such activities as submitting grades, communicating
text/library needs, pursuing professional development opportunities,
conducting classroom research, and developing
innovative instructional materials or opportunities, all of which
are relevant to the achievement of excellence in the instructional
program.

A form for guiding the head/chair's review of instruction is 
included on the Center's web site; for a hard copy, request 
"Appendix D" from the Center. 

Indirect contributions to the instructional program. There is
general agreement that a department's/institution's "productivity"
(success in achieving its goals and objectives) is affected
by such matters as "faculty morale", "collegiality", and "faculty
vitality". However, little attention has been paid to the responsibility
of individual faculty members for contributing to these
"facilitative features" of the academic environment.

In terms of the instructional program, there are at least three 
types of indirect contributions which individual faculty members 
can make. 

1. The general learning environment. Through their social
and professional demeanor, faculty members influence the
"climate" of the department (its openness, objectivity,
tolerance of ambiguity, etc.). In their interaction with departmental
colleagues, faculty members who share teaching
ideas, express interest in the instructional work/concerns of
others, or regularly model intellectual curiosity and
excitement make contributions to the learning environment
which almost certainly will indirectly affect student learning
in a positive manner.

2. Course and curricular development. Course revision/
updating and the development of new instructional materials or
learning aids are two ways of making indirect contributions to
student learning. Keeping abreast of instructional/curricular
innovations and sharing these with colleagues can make a
similar contribution. Likewise, faculty members who are
actively involved in the curriculum revision process and who
explore with colleagues ways to improve the integration/
articulation among specific courses can be expected to have a
positive impact on student achievement.



¹ Some of these (e.g., enthusiasm; clarity of presentation) represent factors
which students are able to assess with reasonable accuracy and may
therefore be excluded from colleague ratings (unless there is a need or desire
to confirm student ratings).

² Small campuses often employ only one or two faculty members in each
discipline; cooperative arrangements with other institutions may make it
feasible to obtain ratings from those in the same discipline. In such
instances, consensus can be achieved through mail or telephone consultation.



represents the institution's judgment of the relative importance
of each evaluation source under ideal conditions; (b) if information
from a given source is unavailable, all faculty members
be given a rating equal to the average of ratings from other
sources; and (c) if information from a given source is believed
to be of marginal validity, create two ratings for each faculty
member: one using the "marginal" process, and one equal to
the average rating for all faculty members from other sources.
These suggestions are intended to ensure that the final evaluation
figure is not unduly affected by a given source, regardless
of how sound that source may be. They will inevitably reduce
the degree to which evaluations differentiate between the
"best" and "worst" teachers; but this is believed to be
preferable to an over-reliance on single sources (student
ratings; colleague ratings; etc.).¹⁰

Summary

Although evaluation is inherently threatening to most faculty 
members, the vast majority take their assignments seriously 
and have a sincere desire to conduct them as effectively as 
possible. When the departmental environment is characterized 
by a strong commitment to mission, mutual respect and trust, 
and administrative support for faculty, a sound evaluation 
program can play a vital role in promoting both individual and 
organizational excellence. 

Assessing faculty performance is a complex and time-consuming
process. If it is done poorly or insensitively, it can have an
adverse effect upon institutional quality. Whether or not
individual institutions elect to commit the resources which valid
evaluations require depends upon the degree to which they
agree with three propositions:

1. All members of the institution should be accountable for
their activities and performance.

2. The conduct and utilization of credible evaluation programs
have an important influence on the welfare and
future excellence of the individual, the department, and
the institution.

3. When improvement efforts are supported by institutional
policy and guided by comprehensive and valid appraisals
of current functioning, the well-being of the individual and
of the institution are positively affected.



¹⁰ By employing "leveling" to deal with incomplete or unreliable data, the
institution risks an inadvertent alteration of the priorities it assigns to
"instruction", "research", and "service". For example, if ratings of instructional
effectiveness differentiate only slightly among faculty, while ratings of
"research" or "service" effectiveness vary widely, the latter will automatically
have an increased impact on overall evaluations. Special procedures to
protect against such unintended effects are needed.
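
The effect described in this note can be illustrated with a small numerical sketch. Everything in it is hypothetical: the equal nominal weights, the four faculty members, and their ratings are invented for illustration, and "spread" (weight times standard deviation) is simply one convenient way to show how much each area can move the overall figure.

```python
from statistics import pstdev

# Hypothetical nominal weights and ratings for four faculty members.
weights = {"instruction": 0.5, "research": 0.5}
ratings = {
    "instruction": [3.9, 4.0, 4.1, 4.0],  # leveled: little differentiation
    "research": [2.0, 3.0, 4.0, 5.0],     # wide differentiation
}

# Spread each area contributes to the overall figure: weight x std. deviation.
spread = {area: weights[area] * pstdev(values) for area, values in ratings.items()}

# Overall evaluation for each faculty member (weighted sum across areas).
overall = [
    sum(weights[area] * ratings[area][i] for area in weights)
    for i in range(4)
]


def rank_order(values):
    """Indices of faculty members ordered from lowest to highest value."""
    return sorted(range(len(values)), key=lambda i: values[i])


print(spread)
print(rank_order(overall) == rank_order(ratings["research"]))
```

Although the two areas carry equal nominal weight in this sketch, the ordering of the overall evaluations follows the research ratings alone, which is the inadvertent shift in priorities the note warns about.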




IDEA Center 

1615 Anderson Avenue 

Manhattan, KS 66502-4073 

Tel: 800.255.2757 or 785.532.5970 

Fax: 785.532.5637 

E-mail: IDEA@ksu.edu 

©1999, IDEA Center 




References

Aleamoni, L. M. (1981). "Student Ratings of Instruction". In J.
Millman (Ed.), Handbook of teacher evaluation, 110-145.
Beverly Hills, CA: Sage.

Bernstein, Daniel (1996). "A Department System for Balancing
the Development and Evaluation of College Teaching: A
Commentary on Cavanagh". Innovative Higher Education,
20, No. 4, 241-247.

Braskamp, L. A. & Ory, J. C. (1994). Assessing faculty work:
Enhancing individual and institutional performance. San
Francisco: Jossey-Bass.

Brinko, K. T. (1991). "The Interactions of Teaching Improvement."
In M. Theall & J. Franklin (Eds.), Effective practices
for improving teaching. New Directions for Teaching and
Learning, No. 48. San Francisco: Jossey-Bass.

Cashin, W. E. (1989). "Defining and Evaluating College
Teaching". IDEA Paper No. 21, IDEA Center, Kansas State
University.

Cashin, W. E. (1995). "Student Ratings of Teaching: The
Research Revisited". IDEA Paper No. 32, IDEA Center,
Kansas State University.

Chickering, A. W. and Gamson, Z. (1987). "Seven Principles
of Good Practice in Undergraduate Education". American
Association for Higher Education Bulletin, 39, 3-7.

Cohen, P. A. (1980). "Effectiveness of Student Rating Feedback
for Improving College Instruction: A Meta-Analysis of
Findings." Research in Higher Education, 13, 321-341.

Cohen, P. A. & McKeachie, W. J. (1980). "The Role of
Colleagues in the Evaluation of Teaching". Improving
College and University Teaching, 28, 147-154.

DeZure, D. (1999). "Evaluating Teaching Through Peer
Classroom Observation". In Seldin, P. & Associates,
Changing practices in evaluating teaching, pp. 70-96.
Bolton, MA: Anker.

Hutchings, P. (Ed.) (1995). From idea to prototype: The peer
review of teaching: A project handbook. Washington,
D.C.: American Association for Higher Education.

Keig, L. & Waggoner, M. D. (1994). Collaborative peer
review: The role of faculty in improving college teaching.
ASHE-ERIC Higher Education Report, No. 2. Washington,
D.C.: The George Washington University, Graduate
School of Education and Human Development.

Seldin, P. & Associates (1993). Successful use of teaching
portfolios. Bolton, MA: Anker.

Seldin, P. (1997). The teaching portfolio: A practical guide to
improved performance and promotion/tenure decisions (2nd
ed.). Bolton, MA: Anker.

Seldin, P. (1999). "Self-evaluation: What Works? What
Doesn't?" In Seldin, P. & Associates, Changing practices in
evaluating teaching, 97-113. Bolton, MA: Anker.

Sheppard, S. D., Johnson, M., & Leifer, L. (1998). "A Model
of Peer and Student Involvement in Course Assessment."
ASEE Journal of Engineering Education, 87 (4), 349-354.

Weimer, Maryellen (1990). Improving College Teaching. San
Francisco: Jossey-Bass.

Wright, W. Alan and Associates (1995). Teaching Improvement
Practices. Bolton, MA: Anker Publishing Co.



3. Improving teaching effectiveness of others. Indirect contributions
to student learning are made when faculty members
consult with each other on teaching methods or strategies or
exchange classroom visits for purposes of offering constructive
critiques. Similar contributions are made by sharing with
colleagues information about an innovative assessment method
or a new experiential component to a course. In departments
which employ graduate teaching assistants or temporary
faculty, indirect contributions can be made by those who offer
advice or other assistance to their less experienced colleagues.

In most cases, the academic department chair/head is in a 
good position to judge the indirect contributions of individual 
faculty members. But it is desirable to obtain additional 
evidence by polling the teaching faculty. While not every 
participant will be able to judge the contributions of every 
faculty member, it is important that all who are able to make 
relevant observations be asked to do so. 

A form for collecting such views is included on the Center's
web site; for a hard copy, request Appendix C from the Center.

Summative and Formative Purposes

Authorities in educational evaluation have traditionally distinguished
between summative and formative evaluation. The
former is done as an aid to administrative decision-making;
the latter focuses on using evaluative information to improve
performance.

Administrative recommendations/decisions. There are four
inter-related administrative decisions or recommendations for
which conclusions about the individual's teaching effectiveness
are important.

1. On the assumption that those who are most successful in
their assignments should receive the largest salary increments,
many institutions have adopted a "merit increase" policy. At
such institutions, the faculty member's merit evaluation is based
in part upon the evaluation of his/her contribution to the
instructional program.³

2. For non-tenured faculty, decisions must be made annually 
with respect to retention. Unless the evaluation of teaching 
effectiveness suggests that the faculty member meets, or will be 
able to meet, the standards for acquiring tenure, it is not in the 
best interest of the institution to retain the faculty member; in 
such instances, retention is not in the best interests of the 
faculty member either, although this may be difficult to accept. 

3. At most institutions, a decision about awarding tenure must
be made after a period of time (usually six years). Such a
decision has critical implications for both the department's
fiscal status and its long-term quality. Because instruction is a
vital function, the tenure policy at most institutions is intended
to ensure that it will be continuously performed at a high level
of quality.⁴

4. Most institutions accord a "rank" to faculty members.
Presumably, those of higher rank are more valuable to the
institution (contribute more to the achievement of its mission)
than those of lower rank. Policies with respect to rank often
involve considerations beyond an assessment of effectiveness
in performing instructional, research, and service assignments.⁵
Nonetheless, those at a given rank are expected to
conduct their assignments with acceptable levels of success.
Therefore, evaluation of professional effectiveness is essential.




Evaluations whose purpose is exclusively "summative" (to aid
in making administrative recommendations or decisions)
should focus on outcomes. A central question is, "How
successfully were the objectives of the course addressed?"⁶
Other outcomes may also be relevant, e.g., the production of
innovative learning materials; the introduction of creative
projects or other extra-class assignments; the conduct of
classroom research which tests instructional hypotheses; etc.
While all evaluations should be conducted carefully and
thoroughly, those whose purpose is summative ask only, "To
what degree did the instructor have a favorable impact on
outcomes pertinent to the goals of the instructional program?";
they need not gather information about techniques, strategies,
or plans which were responsible for this effect.

Recommendations based upon summative evaluations are 
extremely serious. They affect both the lives of individual 
faculty members and the welfare of the department (and, 
ultimately, the institution). Therefore, they should be done with 
great care. Those with respect to rank and tenure are especially 
vital since it is nearly impossible to correct a poor decision. 

Special care should be taken to ensure that the summative
evaluations used to support such decisions are based on a
representative and comprehensive review of the faculty
member's contributions. In terms of the instructional function,
this means that (a) evidence of effectiveness should be available
for every course the faculty member has taught (although
not necessarily for each term); (b) the evaluation should be
based on a cumulative record of the faculty member's teaching
effectiveness (usually involving a minimum of six classes); and
(c) trends in teaching effectiveness (improvement, steady-state,
decline) can be detected.

Improving performance. In contrast to the limited focus of
summative evaluation, formative evaluation requires much
more information. Not only is it necessary to assess the
instructor's impact (positive or negative) on outcomes, but also
to examine characteristics of the instructor which account for
this impact.

It is not necessary to obtain formative evaluations of every 
course each time it is taught. In fact, experience suggests that 
instructional improvement is best facilitated by concentrating 
not only on one course at a time but also on a limited number 



³ The amount of influence which this assessment has on the overall merit
evaluation is usually determined by a statement describing all faculty
responsibilities and the relative importance of each. In some departments,
the same relative importance of teaching, research, and service is assigned
to every faculty member; in others, these assignments differ among faculty.

⁴ Tenure criteria and standards vary among institutions. Almost all require an
evaluation of how effectively the faculty member contributes to the department's
programs. Many also include assessments of matters not considered
in this paper, such as contribution to departmental diversity, cohesion, and
collegiality or evidence that the faculty member will continue to grow in
vitality and professional sophistication.

⁵ Traditionally, the level of difficulty or complexity of professional assignments,
whether in teaching, research, or service, differentiates among ranks. In the
instructional area, those of highest rank are usually expected to be the most
versatile in terms of the variety of courses they can offer; frequently, they
provide advanced and specialized courses which those of lower rank are not
yet qualified to teach.

⁶ Student learning is affected by many matters, including the motivation of
enrollees to learn, the adequacy of their background, and their academic
habits and skills. Since faculty evaluation is concerned with the contribution
the faculty member made to student learning, it is desirable to exclude (take
into account) the contribution made by such "extraneous" influences. For this
reason, the IDEA system provides "adjusted" ratings.




